
    Adding complexity to complexity: Gene family evolution in polyploids

    Comparative genomics of non-model organisms has resurrected whole genome duplication (WGD) from being viewed as a somewhat obscure process that happens in plants to a primary driver of eukaryotic diversification. The shadow of past ploidy increases has left a strong signature of duplicated genes organized into gene families, even in small genomes that have undergone effectively complete rediploidization. Nevertheless, despite continually advancing technologies and bioinformatics pipelines, resolving the fate of duplicate genes remains a substantial challenge. For example, many important recognition processes are driven not only by allelic expansion through retention of duplicates but also by diversification and copy number variation. This creates technical difficulties with assembly to reference genomes and accurate interpretation of homology. Thus, relatively little is known about the impacts of recent polyploidization and hybridization on the evolution of gene families under selective forces that maintain diversity, such as balancing selection. Here we use a complex of species and ploidy levels in the genus Arabidopsis (A. lyrata and A. arenosa) as a model to investigate the evolutionary dynamics of a large and complicated gene family known to be under strong balancing selection: the receptor-like kinases, which include the female component of genetically controlled self-incompatibility. Specifically, we question: (1) How does diversity of S-receptor kinase (SRK) alleles in tetraploids compare to that in their close diploid relatives? (2) Is there increased trans-specific polymorphism (i.e., sharing of alleles that transcend speciation, characteristic of balancing selection) in tetraploids compared to diploids due to the higher number of copies they carry? (3) Do these highly variable loci show evidence of introgression among extant species/ploidy levels within or outside known zones of hybridization? (4) Is there evidence for copy number variation among paralogs? We use this example to highlight specific issues to consider when interpreting gene family evolution, particularly in relation to polyploids but also more generally in diploids. We conclude with recommendations for strategies to address the challenges of resolving such complex loci in the future, using advances in deep sequencing approaches.

    Enabling comparative modeling of closely related genomes: Example genus Brucella

    For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this short report, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from trivial because of annotation inconsistencies. We propose a protocol for comparative analysis of metabolic models of closely related genomes, using fifteen strains of the genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella but were not originally identified by automated metabolic reconstructions. We are currently implementing this protocol to improve automated annotations within the SEED database, and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step for the future creation of consistent annotation systems and high-quality model reconstructions that will support the accurate prediction of phenotypes such as pathogenicity, media requirements or type of respiration. We thank Jean Jacques Letesson, Maite Iriarte, Stephan Kohler and David O'Callaghan for their input on improving specific annotations. This project has been funded by the United States National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN272200900040C, awarded to BW Sobral, and by the United States National Science Foundation under Grant MCB-1153357, awarded to CS Henry. J.P.F. acknowledges funding from [FRH/BD/70824/2010] of the FCT (Portuguese Foundation for Science and Technology) Ph.D. scholarship.

    Draft Genome Sequence of the Marine Streptomyces sp. Strain PP-C42, Isolated from the Baltic Sea

    Streptomyces, a branch of aerobic Gram-positive bacteria, represents the largest genus of actinobacteria. The streptomycetes are characterized by a complex secondary metabolism and produce over two-thirds of the natural antibiotics in clinical use today. Here we report the draft genome sequence of Streptomyces sp. strain PP-C42, isolated from the marine environment. A subset of unique genes and gene clusters for diverse secondary metabolites as well as antimicrobial peptides (AMPs) could be identified from the genome, showing great promise as a source for novel bioactive compounds.

    Phylogeography and host specificity of Pasteurellaceae pathogenic to sea-farmed fish in the north-east Atlantic

    The present study was undertaken to address the recent spate of pasteurellosis outbreaks among sea-farmed Atlantic salmon (Salmo salar) in Norway and Scotland, coinciding with sporadic disease episodes in lumpfish (Cyclopterus lumpus) used for delousing purposes in salmon farms. Genome assemblies from 86 bacterial isolates cultured from diseased salmon or lumpfish confirmed them all as bona fide members of the Pasteurellaceae family, with phylogenetic reconstruction dividing them into two distinct branches sharing <88% average nucleotide identity. These branches therefore constitute two separate species, namely Pasteurella skyensis and the as-yet invalidly named “Pasteurella atlantica”. Both species further stratify into multiple discrete genomovars (gv.) and/or lineages, each being nearly or fully exclusive to a particular host, geographic region, and/or time period. Pasteurellosis in lumpfish is, irrespective of spatiotemporal origin, linked almost exclusively to the highly conserved “P. atlantica gv. cyclopteri” (Pac). In contrast, pasteurellosis in Norwegian sea-farmed salmon, dominated since the late-1980s by “P. atlantica gv. salmonicida” (Pas), first saw three specific lineages (Pas-1, -2, and -3) causing separate, geographically restricted, and short-lived outbreaks, before a fourth (Pas-4) emerged recently and became more widely disseminated. A similar situation involving P. skyensis (Ps) has apparently been unfolding in Scottish salmon farming since the mid-1990s, where two historic (Ps-1 and -2) and one contemporary (Ps-3) lineages have been recorded. While the epidemiology underlying all these outbreaks/epizootics remains unclear, repeated detection of 16S rRNA gene amplicons very closely related to P. skyensis and “P. atlantica” from at least five cetacean species worldwide raises the question as to whether marine mammals may play a part, possibly as reservoirs. In fact, the close relationship between the studied isolates and Phocoenobacter uteri associated with harbor porpoise (Phocoena phocoena), and their relatively distant relationship with other members of the genus Pasteurella, suggests that both P. skyensis and “P. atlantica” should be moved to the genus Phocoenobacter.

    Origin of Saxitoxin Biosynthetic Genes in Cyanobacteria

    BACKGROUND: Paralytic shellfish poisoning (PSP) is a potentially fatal syndrome associated with the consumption of shellfish that have accumulated saxitoxin (STX). STX is produced by microscopic marine dinoflagellate algae. Little is known about the origin and spread of saxitoxin genes in these under-studied eukaryotes. Fortuitously, some freshwater cyanobacteria also produce STX, providing an ideal model for studying its biosynthesis. Here we focus on saxitoxin-producing cyanobacteria and their non-toxic sisters to elucidate the origin of genes involved in the putative STX biosynthetic pathway. METHODOLOGY/PRINCIPAL FINDINGS: We generated a draft genome assembly of the saxitoxin-producing (STX+) cyanobacterium Anabaena circinalis ACBU02 and searched for 26 candidate saxitoxin genes (named sxtA to sxtZ) that were recently identified in the toxic strain Cylindrospermopsis raciborskii T3. We also generated a draft assembly of the non-toxic (STX-) sister Anabaena circinalis ACFR02 to aid the identification of saxitoxin-specific genes. Comparative phylogenomic analyses revealed that nine putative STX genes were horizontally transferred from non-cyanobacterial sources, whereas one key gene (sxtA) originated in STX+ cyanobacteria via two independent horizontal transfers followed by fusion. In total, of the 26 candidate saxitoxin genes, 13 are of cyanobacterial provenance and are monophyletic among the STX+ taxa, four are shared among STX+ and STX- cyanobacteria, and the remaining nine genes are specific to STX+ cyanobacteria. CONCLUSIONS/SIGNIFICANCE: Our results provide evidence that the assembly of STX genes in ACBU02 involved multiple HGT events from different sources, followed presumably by coordination of the expression of foreign and native genes in the common ancestor of STX+ cyanobacteria. The ability to produce saxitoxin was subsequently lost independently multiple times, resulting in a nested relationship of STX+ and STX- strains among Anabaena circinalis strains.

    An Introduction to RNA Databases

    We present an introduction to RNA databases. The history and technology behind RNA databases are briefly discussed. We examine differing methods of data collection and curation, and discuss their impact on both the scope and accuracy of the resulting databases. Finally, we demonstrate these principles through detailed examination of four leading RNA databases: Noncode, miRBase, Rfam, and SILVA. Comment: 27 pages, 10 figures, 1 table. Submitted as a chapter for "An introduction to RNA bioinformatics" to be published by "Methods in Molecular Biology"

    Defining the Pseudomonas Genus: Where Do We Draw the Line with Azotobacter?

    The genus Pseudomonas has gone through many taxonomic revisions over the past 100 years, going from a very large and diverse group of bacteria to a smaller, more refined and ordered list having specific properties. The relationship of the Pseudomonas genus to Azotobacter vinelandii is examined using three genomic sequence-based methods. First, using 16S rRNA trees, it is shown that A. vinelandii groups within the Pseudomonas genus, close to Pseudomonas aeruginosa. Genomes from other related organisms (Acinetobacter, Psychrobacter, and Cellvibrio) are outside the Pseudomonas cluster. Second, pan-genome family trees based on conserved gene families also show A. vinelandii to be more closely related to Pseudomonas than other related organisms are. Third, exhaustive BLAST comparisons demonstrate that the fraction of shared genes between A. vinelandii and Pseudomonas genomes is similar to that of Pseudomonas species with each other. The results of these different methods point to a high similarity between A. vinelandii and the Pseudomonas genus, suggesting that Azotobacter might actually be a Pseudomonas.

    De novo assembly of Euphorbia fischeriana root transcriptome identifies prostratin pathway related genes

    Background: Euphorbia fischeriana is an important medicinal plant found in Northeast China. The plant roots contain many medicinal compounds including 12-deoxyphorbol-13-acetate, commonly known as prostratin, which is a phorbol ester from the tigliane diterpene series. Prostratin is a protein kinase C activator and is effective in the treatment of Human Immunodeficiency Virus (HIV) by acting as a latent HIV activator. Latent HIV is currently the biggest limitation for viral eradication. The aim of this study was to sequence, assemble and annotate the E. fischeriana transcriptome to better understand the potential biochemical pathways leading to the synthesis of prostratin and other related diterpene compounds. Results: In this study we used a high-throughput RNA-seq approach to sequence the root transcriptome of E. fischeriana. We assembled 18,180 transcripts, of which the majority were protein-coding and only 17 corresponded to known RNA genes. Interestingly, we identified 5,956 protein-coding transcripts with high similarity (>=75%) to Ricinus communis, a close relative of E. fischeriana. We also evaluated the conservation of E. fischeriana genes against EST datasets from the Euphorbiaceae family, which included R. communis, Hevea brasiliensis and Euphorbia esula. We identified a core set of 1,145 gene clusters conserved in all four species and 1,487 E. fischeriana paralogous genes. Furthermore, we screened E. fischeriana transcripts against an in-house reference database for genes implicated in the biosynthesis of upstream precursors to prostratin. This identified 24 and 9 candidate transcripts involved in the terpenoid and diterpenoid biosynthesis pathways, respectively. The majority of the candidate genes in these pathways presented relatively low expression levels except for 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase (HDS) and isopentenyl diphosphate/dimethylallyl diphosphate synthase (IDS), which are required for multiple downstream pathways including synthesis of casbene, a proposed precursor to prostratin. Conclusion: The resources generated in this study provide new insights into the upstream pathways to the synthesis of prostratin and will likely facilitate functional studies aiming to produce larger quantities of this compound for HIV research and/or treatment of patients.